Corpus-based Evaluation of a French Spelling and Grammar Checker

Authors

  • Marianne Starlander
  • Andrei Popescu-Belis
Abstract

This article describes an evaluation method for spelling and grammar checkers and reports the results of applying it to two French checkers. The evaluation process closely follows the ISO/IEC and EAGLES guidelines and defines the evaluation metrics precisely, so that they can be easily reproduced. The choice of professional translators as the user profile entails the use of a corpus of spelling mistakes, which was collected and annotated. The metrics are divided into three sets: classification of perfect vs. imperfect sentences; detection of mistakes; and correction of mistakes. The results show in which respects the two systems are best suited to the users' needs, and the points on which they could be improved.
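
The abstract names three sets of metrics but does not give their formulas here. As a rough illustration only, the sketch below assumes standard precision and recall for sentence classification and mistake detection, plus a simple correction rate. The `Sentence` record, its fields, and the `evaluate` function are hypothetical names introduced for this example and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Sentence:
    # All three fields hold hypothetical mistake identifiers (e.g. token positions).
    gold_errors: set   # mistakes annotated in the corpus
    flagged: set       # mistakes flagged by the checker
    corrected: set     # flagged mistakes for which the checker proposed the right fix


def ratio(numerator, denominator):
    """Safe division: returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def evaluate(sentences):
    """Compute three assumed metric sets over an annotated corpus."""
    # Set 1: classification of perfect vs. imperfect sentences.
    imperfect = [s for s in sentences if s.gold_errors]
    judged_imperfect = [s for s in sentences if s.flagged]
    agreed = [s for s in imperfect if s.flagged]

    # Set 2: detection of individual mistakes.
    gold = sum(len(s.gold_errors) for s in sentences)
    flagged = sum(len(s.flagged) for s in sentences)
    hits = sum(len(s.gold_errors & s.flagged) for s in sentences)

    # Set 3: correction of the mistakes that were correctly detected.
    fixed = sum(len(s.corrected & s.gold_errors & s.flagged) for s in sentences)

    return {
        "sentence_precision": ratio(len(agreed), len(judged_imperfect)),
        "sentence_recall": ratio(len(agreed), len(imperfect)),
        "detection_precision": ratio(hits, flagged),
        "detection_recall": ratio(hits, gold),
        "correction_rate": ratio(fixed, hits),
    }
```

Keeping the three sets separate means a checker that reliably spots imperfect sentences but proposes poor corrections is not credited for both.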

Similar resources

A Corpus-based Collocation Assistant for Swedish Text

A collocation is a recurrent combination of words, such as "commit a crime", whose meaning is fairly transparent but which cannot be changed arbitrarily, even if the rules of grammar are adhered to and the individual meanings of the words are preserved. If a writer tries to use "do a crime", the job of a collocation assistant is to suggest a better alternative such as "commit a crime" just like a sp...

Cysill Ar-lein: A Corpus of Written Contemporary Welsh Compiled from an On-line Spelling and Grammar Checker

This paper describes the use of a free, on-line language spelling and grammar checking aid as a vehicle for the collection of a significant (31 million words and rising) corpus of text for academic research in the context of less resourced languages where such data in sufficient quantities are often unavailable. It describes two versions of the corpus: the texts as submitted, prior to the corre...

Spelling Checker-based Language Identification for the Eleven Official South African Languages

Language identification is often the first step when compiling corpora from web pages or other unstructured sources. In this paper, an effective and accurate method for identification of all eleven official South African languages is presented. The method is based on reusing commercial spelling checkers and consists of a multi-stage architecture that is described in detail. We describe the impl...

Statistical Machine Translation as a Grammar Checker for Persian Language

The existence of automatic writing assistance tools such as spelling and grammar checkers/correctors can help increase the quality of electronic texts by removing noise and cleaning up sentences. Errors in a text can be categorized into spelling, grammatical and real-word errors. In this article, the concepts of an automatic grammar checker for Persian (Farsi) language, is...

An Argumentative Approach to Assessing Natural Language Usage based on the Web Corpus

In spite of the significant evolution of spelling and grammar checkers for word-processing software, the problem of judging the appropriateness of language usage in different contexts remains to a large extent still unsolved. This paper presents a novel, argumentative approach to providing proactive assistance for language usage assessment on the basis of the web linguistic corpus. A defeasible...

Publication date: 2002